{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Overview**\n",
    "\n",
    "This notebook explains how to run experiments using Sample-Factory, as well as upload and download models from the Hugging Face Hub. We will use OpenAI Gym's Lunar Lander environment as an example.\n",
    "\n",
    "Note that this notebook is for demonstration purposes only. It is not recommended to run custom Sample-Factory experiments in Jupyter Notebooks due to Python Multiprocessisng not working with functions and objects not defined in an imported module. In this example notebook, we use environment functions defined in `sf_examples` to set up the gym environment and parse custom enviornments. Including the `make_gym_env_func` directly in this notebook will result in an error when running the experiment.\n",
    "\n",
    "\n",
    "**Step 1: Install Dependencies**\n",
    "\n",
    "To run this notebook, we need to install both `sample-factory` and the Lunar Lander environment using `pip`. Additional setup is required to use the Hugging Face Hub and instructions can be found at https://www.samplefactory.dev/10-huggingface/huggingface/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install sample-factory\n",
    "!pip install gym[box2d]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Imports and setup\n",
    "# General imports\n",
    "import gym\n",
    "from typing import Optional\n",
    "from IPython.display import Video\n",
    "\n",
    "# Imports from Sample Factory library\n",
    "from sample_factory.envs.env_utils import register_env\n",
    "from sample_factory.train import run_rl\n",
    "from sample_factory.enjoy import enjoy\n",
    "from sample_factory.huggingface.huggingface_utils import load_from_hf\n",
    "\n",
    "# Imports specific for gym environment from sample factory examples\n",
    "from sf_examples.train_gym_env import parse_custom_args, make_gym_env_func\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 2: Create Lunar Lander Environemnt and Specify Training Parameters**\n",
    "\n",
    "First, we need to create the Lunar Lander training environment. We can do so using Sample-Factory's `register_env` to register the environment, along with a custom function to create an environment from its name. For gym environemnts, we can use `make_gym_env_func` from `sf_examples.train_gym_env`.\n",
    "\n",
    "We also need to specify some parameters for our experiment. All experiments need to specify `algo` which is the algorithm used to train, `env` which is the environment we are running on, and `experiment` which is where to save the model after running the experiment.\n",
    "\n",
    "Other training parameters can be specified as well. A full list of parameters can be found by running Sample-Factory with the `--help` flag.\n",
    "\n",
    "See the `sf_examples` folder for examples of how to run experiments using Sample Factory in different environments. Some supported enviornments include Atari, DMLab, MuJoCo, and ViZDoom."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Register Lunar Lander environment\n",
    "register_env(\"LunarLanderContinuous-v2\", make_gym_env_func)\n",
    "\n",
    "# Initialize basic arguments for running the experiment. These parameters are required to run any experiment\n",
    "# The parameters can also be specified in the command line\n",
    "experiment_name = \"lunar_lander_example\"\n",
    "argv = [\"--algo=APPO\", \"--env=LunarLanderContinuous-v2\", f\"--experiment={experiment_name}\"]\n",
    "\n",
    "cfg = parse_custom_args(argv=argv, evaluation=False)\n",
    "\n",
    "# The following parameters can be changed from the default\n",
    "cfg.reward_scale = 0.05\n",
    "cfg.train_for_env_steps = 1000000\n",
    "cfg.gae_lambda = 0.99\n",
    "cfg.num_workers = 20\n",
    "cfg.num_envs_per_worker = 6\n",
    "cfg.seed = 0\n",
    "\n",
    "# Experiments can also be run using CPU only\n",
    "# For best performance, it is recommended to turn on GPU Hardware acceleration in Colab under \"Runtime\" > \"Change Runtime Type\"\n",
    "cfg.device = 'gpu' if torch.cuda.is_available() else 'cpu'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 3: Run Experiment**\n",
    "\n",
    "Next, we train the experiment using the parameters we specified above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "run_rl(cfg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 4: Evaluating the Model and Uploading to the Hugging Face Hub**\n",
    "\n",
    "After training the model, we can use the `enjoy` function to see how well it did. `enjoy` also allows us to generate a video with the `--save_video` flag, as well as upload the model to the Hub using the `--push_to_hub` flag. Make sure you also specify `--hf_repository` with your Hugging Face username and repository name in the form `<username>/<repo_name>`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Change this to your Hugging Face username\n",
    "username = \"andrewzhang505\"\n",
    "\n",
    "enjoy_args = [\"--no_render\", \"--max_num_episodes=5\", \"--push_to_hub\", f\"--hf_repository={username}/{experiment_name}\", \"--save_video\"]\n",
    "cfg = parse_custom_args(argv=argv+enjoy_args, evaluation=True)\n",
    "enjoy(cfg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Video(f\"./train_dir/{experiment_name}/replay.mp4\", embed=True)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 5: Downloading Models from the Hub**\n",
    "\n",
    "You can also download other models from the Hub for your own use as well. The following will download the model to `./train_dir/sf2-lunar-lander/` and you can use the model by specifying `--experiment=sf2-lunar-lander`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "load_from_hf(\"./train_dir\", \"andrewzhang505/sf2-lunar-lander\")\n",
    "\n",
    "download_args = [\"--algo=APPO\", \"--env=LunarLanderContinuous-v2\", \"--experiment=sf2-lunar-lander\", \"--no_render\", \"--max_num_episodes=1\", \"--save_video\"]\n",
    "cfg = parse_custom_args(argv=download_args, evaluation=True)\n",
    "enjoy(cfg)\n",
    "\n",
    "Video(f\"./train_dir/sf2-lunar-lander/replay.mp4\", embed=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Additional Resources**\n",
    "\n",
    "For more information on using Sample-Factory, check out our website at https://www.samplefactory.dev/ and our github at https://github.com/alex-petrenko/sample-factory/"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.9.13 ('test')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "bd5eb279d10053843772e5b13836afa1489cd82e3492b360393a155ea89d875e"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
